NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Tactile Emotions: Multimodal Affective Captioning with Haptics Improves Narrative Engagement for d/Deaf and Hard-of-Hearing Viewers

https://doi.org/10.1145/3706598.3713304

de_Lacerda_Pataca, Caluã; Hassan, Saad; May, Lloyd; Olson, Michelle M; D'aurio, Toni; Peiris, Roshan L; Huenerfauth, Matt (April 2025, ACM)

This paper explores a multimodal approach for translating emotional cues present in speech, designed with Deaf and Hard-of-Hearing (dhh) individuals in mind. Prior work has focused on visual cues applied to captions, successfully conveying whether a speaker’s words have a negative or positive tone (valence), but with mixed results regarding the intensity (arousal) of these emotions. We propose a novel method using haptic feedback to communicate a speaker’s arousal levels through vibrations on a wrist-worn device. In a formative study with 16 dhh participants, we tested six haptic patterns and found that participants preferred single per-word vibrations at 75 Hz to encode arousal. In a follow-up study with 27 dhh participants, this pattern was paired with visual cues, and narrative engagement with audio-visual content was measured. Results indicate that combining haptics with visuals significantly increased engagement compared to a conventional captioning baseline and a visuals-only affective captioning style.
more » « less
Full Text Available
PACE: Participatory AI for Community Engagement

https://doi.org/10.1609/hcomp.v12i1.31610

Hassan, Saad; Asad, Syeda_Mah Noor; Eslami, Motahhare; Mattei, Nicholas; Culotta, Aron; Zimmerman, John (October 2024, Proceedings of the AAAI Conference on Human Computation and Crowdsourcing)

Public sector leverages artificial intelligence (AI) to enhance the efficiency, transparency, and accountability of civic operations and public services. This includes initiatives such as predictive waste management, facial recognition for identification, and advanced tools in the criminal justice system. While public-sector AI can improve efficiency and accountability, it also has the potential to perpetuate biases, infringe on privacy, and marginalize vulnerable groups. Responsible AI (RAI) research aims to address these concerns by focusing on fairness and equity through participatory AI. We invite researchers, community members, and public sector workers to collaborate on designing, developing, and deploying RAI systems that enhance public sector accountability and transparency. Key topics include raising awareness of AI's impact on the public sector, improving access to AI auditing tools, building public engagement capacity, fostering early community involvement to align AI innovations with public needs, and promoting accessible and inclusive participation in AI development. The workshop will feature two keynotes, two short paper sessions, and three discussion-oriented activities. Our goal is to create a platform for exchanging ideas and developing strategies to design community-engaged RAI systems while mitigating the potential harms of AI and maximizing its benefits in the public sector.
more » « less
Full Text Available
Caption Royale: Exploring the Design Space of Affective Captions from the Perspective of Deaf and Hard-of-Hearing Individuals

https://doi.org/10.1145/3613904.3642258

de_Lacerda_Pataca, Caluã; Hassan, Saad; Tinker, Nathan; Peiris, Roshan Lalintha; Huenerfauth, Matt (May 2024, ACM)

Affective captions employ visual typographic modulations to convey a speaker’s emotions, improving speech accessibility for Deaf and Hard-of-Hearing (dhh) individuals. However, the most effective visual modulations for expressing emotions remain uncertain. Bridging this gap, we ran three studies with 39 dhh participants, exploring the design space of affective captions, which include parameters like text color, boldness, size, and so on. Study 1 assessed preferences for nine of these styles, each conveying either valence or arousal separately. Study 2 combined Study 1’s top-performing styles and measured preferences for captions depicting both valence and arousal simultaneously. Participants outlined readability, minimal distraction, intuitiveness, and emotional clarity as key factors behind their choices. In Study 3, these factors and an emotion-recognition task were used to compare how Study 2’s winning styles performed versus a non-styled baseline. Based on our findings, we present the two best-performing styles as design recommendations for applications employing affective captions.
more » « less
Full Text Available
Exploring the Benefits and Applications of Video-Span Selection and Search for Real-Time Support in Sign Language Video Comprehension among ASL Learners

https://doi.org/10.1145/3690647

Hassan, Saad; de_Lacerda_Pataca, Caluã; Amin, Akhter Al; Nourian, Laleh; Navarro, Diego; Lee, Sooyeon; Gordon, Alexis; Watkins, Matthew; Tigwell, Garreth W; Huenerfauth, Matt (September 2024, ACM Transactions on Accessible Computing)

People learning American Sign Language (ASL) and practicing their comprehension skills will often encounter complex ASL videos that may contain unfamiliar signs. Existing dictionary tools require users to isolate a single unknown sign before initiating a search by selecting linguistic properties or performing the sign in front of a webcam. This process presents challenges in extracting and reproducing unfamiliar signs, disrupting the video-watching experience, and requiring learners to rely on external dictionaries. We explore a technology that allows users to select and view dictionary results for one or more unfamiliar signs while watching a video. We interviewed 14 ASL learners to understand their challenges in understanding ASL videos, strategies for dealing with unfamiliar vocabulary, and expectations for anin situdictionary system. We then conducted an in-depth analysis with eight learners to examine their interactions with a Wizard-of-Oz prototype during a video comprehension task. Finally, we conducted a comparative study with six additional ASL learners to evaluate the speed, accuracy, and workload benefits of an embedded dictionary-search feature within a video player. Our tool outperformed a baseline in the form of an existing online dictionary across all three metrics. The integration of a search tool and span selection offered advantages for video comprehension. Our findings have implications for designers, computer vision researchers, and sign language educators.
more » « less
Full Text Available
Designing and Evaluating an Advanced Dance Video Comprehension Tool with In-situ Move Identification Capabilities

https://doi.org/10.1145/3613904.3642710

Hassan, Saad; De_Lacerda_Pataca, Caluã; Nourian, Laleh; Tigwell, Garreth W; Davis, Briana; Silver_Wagman, Will Zhenya (May 2024, ACM)

Analyzing dance moves and routines is a foundational step in learning dance. Videos are often utilized at this step, and advancements in machine learning, particularly in human-movement recognition, could further assist dance learners. We developed and evaluated a Wizard-of-Oz prototype of a video comprehension tool that offers automatic in-situ dance move identification functionality. Our system design was informed by an interview study involving 12 dancers to understand the challenges they face when trying to comprehend complex dance videos and taking notes. Subsequently, we conducted a within-subject study with 8 Cuban salsa dancers to identify the benefits of our system compared to an existing traditional feature-based search system. We found that the quality of notes taken by participants improved when using our tool, and they reported a lower workload. Based on participants’ interactions with our system, we offer recommendations on how an AI-powered span-search feature can enhance dance video comprehension tools.
more » « less
Full Text Available
Sign Spotter: Design and Initial Evaluation of an Automatic Video-Based American Sign Language Dictionary System

https://doi.org/10.1145/3597638.3614497

Bohacek, Matyas; Hassan, Saad (October 2023, ACM)

Searching unfamiliar American Sign Language (ASL) words in a dictionary is challenging for learners, as it involves recalling signs from memory and providing specific linguistic details. Fortunately, the emergence of sign-recognition technology will soon enable users to search by submitting a video of themselves performing the word. Although previous research has independently addressed algorithmic enhancements and design aspects of ASL dictionaries, there has been limited effort to integrate both. This paper presents the design of an end-to-end sign language dictionary system, incorporating design recommendations from recent human–computer interaction (HCI) research. Additionally, we share preliminary findings from an interview-based user study with four ASL learners.
more » « less
Full Text Available
Understanding How Deaf and Hard of Hearing Viewers Visually Explore Captioned Live TV News

https://doi.org/10.1145/3587281.3587287

Amin, Akhter Al; Hassan, Saad; Lee, Sooyeon; Huenerfauth, Matt (April 2023, Proceedings of the 20th International Web for All Conference)

Full Text Available
Modeling Word Importance in Conversational Transcripts: Toward improved live captioning for Deaf and hard of hearing viewers

https://doi.org/10.1145/3587281.3587290

Amin, Akhter Al; Hassan, Saad; Huenerfauth, Matt; Alm, Cecilia Ovesdotter (April 2023, Proceedings of the 20th International Web for All Conference)

Full Text Available
Exploring the Design Space of Automatically Generated Emotive Captions for Deaf or Hard of Hearing Users

https://doi.org/10.1145/3544549.3585880

Hassan, Saad; Ding, Yao; Kerure, Agneya Abhimanyu; Miller, Christi; Burnett, John; Biondo, Emily; Gilbert, Brenden (April 2023, Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems (CHI EA '23))

Caption text conveys salient auditory information to deaf or hard-of-hearing (DHH) viewers. However, the emotional information within the speech is not captured. We developed three emotive captioning schemas that map the output of audio-based emotion detection models to expressive caption text that can convey underlying emotions. The three schemas used typographic changes to the text, color changes, or both. Next, we designed a Unity framework to implement these schemas and used it to generate stimuli videos. In an experimental evaluation with 28 DHH viewers, we compared DHH viewers’ ability to understand emotions and their subjective judgments across the three captioning schemas. We found no significant difference in participants’ ability to understand the emotion based on the captions or their subjective preference ratings. Open-ended feedback revealed factors contributing to individual differences in preferences among the participants and challenges with automatically generated emotive captions that motivate future work.
more » « less
Full Text Available
Designing and Experimentally Evaluating a Video-based American Sign Language Look-up System

https://doi.org/10.1145/3498366.3505804

Hassan, Saad (March 2022, ACM SIGIR Conference on Human Information Interaction and Retrieval)

Despite some prior research and commercial systems, if someone sees an unfamiliar American Sign Language (ASL) word and wishes to look up its meaning in a dictionary, this remains a difficult task. There is no standard label a user can type to search for a sign, and formulating a query based on linguistic properties is challenging for students learning ASL. Advances in sign-language recognition technology will soon enable the design of a search system for ASL word look-up in dictionaries, by allowing users to generate a query by submitting a video of themselves performing the word they believe they encountered somewhere. Users would then view a results list of video clips or animations, to seek the desired word. In this research, we are investigating the usability of such a proposed system, a webcam-based ASL dictionary system, using a Wizard-of-Oz prototype and enhanced the design so that it can support sign language word look-up even when the performance of the underlying sign-recognition technology is low. We have also investigated the requirements of students learning ASL in regard to how results should be displayed and how a system could enable them to filter the results of the initial query, to aid in their search for a desired word. We compared users’ satisfaction when using a system with or without post-query filtering capabilities. We discuss our upcoming study to investigate users’ experience with a working prototype based on actual sign-recognition technology that is being designed. Finally, we discuss extensions of this work to the context of users searching datasets of videos of other human movements, e.g. dance moves, or when searching for words in other languages.
more » « less
Full Text Available

« Prev Next »

Search for: All records